fix(cache): delay history image pruning to preserve prompt cache prefix #3
Open
pruneProcessedHistoryImages was stripping image blocks from every already-answered user turn on each run. Turn N sends image bytes → provider caches the prefix. Turn N+1 replaces image with text marker → bytes diverge at that message → cache miss from there onward. Now only prune images older than 3 assistant turns. Recent history stays byte-identical so the cached prefix survives, while legacy sessions with persisted image payloads still get cleaned up.
Problem
pruneProcessedHistoryImages replaces image blocks with the text "[image data removed - already processed by model]" for every user/toolResult message before the last assistant turn, on every run.

This defeats prompt caching for any conversation that includes images.
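The cache miss can be illustrated with a small sketch. It assumes a provider that reuses a cached prefix only up to the first message whose serialized bytes differ. The `Block` and `Message` shapes and the `sharedPrefixLength` helper are hypothetical and are not taken from the codebase.

```typescript
// Hypothetical message shapes for illustration; the real types may differ.
type Block =
  | { type: "text"; text: string }
  | { type: "image"; data: string };

type Message = { role: "user" | "assistant" | "toolResult"; content: Block[] };

const MARKER = "[image data removed - already processed by model]";

// Count how many leading messages serialize identically in both requests.
// Under the stated assumption, a prefix cache is reusable only up to this point.
function sharedPrefixLength(a: Message[], b: Message[]): number {
  let i = 0;
  while (
    i < a.length &&
    i < b.length &&
    JSON.stringify(a[i]) === JSON.stringify(b[i])
  ) {
    i++;
  }
  return i;
}

// Request at turn N: the user message carries the image bytes.
const turnN: Message[] = [
  {
    role: "user",
    content: [
      { type: "text", text: "What is in this picture?" },
      { type: "image", data: "aGVsbG8=" },
    ],
  },
];

// Request at turn N+1 with per-run pruning: the image became a text marker.
const turnN1Pruned: Message[] = [
  {
    role: "user",
    content: [
      { type: "text", text: "What is in this picture?" },
      { type: "text", text: MARKER },
    ],
  },
  { role: "assistant", content: [{ type: "text", text: "A cat." }] },
  { role: "user", content: [{ type: "text", text: "What colour is it?" }] },
];

// Request at turn N+1 when the image is left in place.
const turnN1Kept: Message[] = [turnN[0], ...turnN1Pruned.slice(1)];

const prunedShared = sharedPrefixLength(turnN, turnN1Pruned);
const keptShared = sharedPrefixLength(turnN, turnN1Kept);
console.log({ prunedShared, keptShared });
```

With pruning, the two requests share no leading messages; with the image kept, the whole turn N request is a prefix of the turn N+1 request.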
Fix
Only prune images older than 3 assistant turns (PRESERVE_RECENT_ASSISTANT_TURNS). Recent history stays byte-identical so the cached prefix survives across turns, while legacy sessions with persisted image payloads still get cleaned up once they age out.

The original purpose, per the doc comment, was "idempotent cleanup for legacy sessions", so aggressive per-turn pruning was never needed.
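A minimal sketch of the delayed pruning follows. It is not the PR's actual implementation. It assumes that "older than 3 assistant turns" means "located before the third most recent assistant message". The message types and the `pruneOldImages` name are hypothetical; only the two constant names and the marker text come from this PR.

```typescript
// Hypothetical message shapes for illustration; the real types may differ.
type Block =
  | { type: "text"; text: string }
  | { type: "image"; data: string };

type Message = { role: "user" | "assistant" | "toolResult"; content: Block[] };

const PRESERVE_RECENT_ASSISTANT_TURNS = 3;
const PRUNED_HISTORY_IMAGE_MARKER =
  "[image data removed - already processed by model]";

// Replace image blocks with the marker, but only in user/toolResult messages
// that sit before the `preserve`-th most recent assistant message (assumed cutoff).
function pruneOldImages(
  messages: Message[],
  preserve: number = PRESERVE_RECENT_ASSISTANT_TURNS,
): Message[] {
  // Walk backwards to find the index of the `preserve`-th most recent assistant message.
  let seen = 0;
  let cutoff = -1;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "assistant") {
      seen++;
      if (seen === preserve) {
        cutoff = i;
        break;
      }
    }
  }
  // Fewer than `preserve` assistant turns: nothing is old enough to prune.
  if (cutoff < 0) return messages;

  return messages.map((m, i): Message => {
    // Leave the recent window and assistant messages untouched (same object),
    // so they serialize to identical bytes on the next request.
    if (i >= cutoff || m.role === "assistant") return m;
    if (!m.content.some((b) => b.type === "image")) return m;
    return {
      ...m,
      content: m.content.map((b): Block =>
        b.type === "image"
          ? { type: "text", text: PRUNED_HISTORY_IMAGE_MARKER }
          : b,
      ),
    };
  });
}

// Example: four answered image turns followed by a new user message.
const img = (n: number): Message => ({
  role: "user",
  content: [{ type: "image", data: `payload-${n}` }],
});
const reply = (n: number): Message => ({
  role: "assistant",
  content: [{ type: "text", text: `answer ${n}` }],
});
const history: Message[] = [
  img(0), reply(0), img(1), reply(1), img(2), reply(2), img(3), reply(3), img(4),
];
const pruned = pruneOldImages(history);
```

In this example the cutoff lands on the assistant message at index 3, so only the user messages at indices 0 and 2 lose their images. Running the function a second time on its own output changes nothing, which matches the idempotent cleanup described in the doc comment.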
Verification
- [...] PRUNED_HISTORY_IMAGE_MARKER appearing at turn N+1 (only referenced in this file + test)
- [...] (applyPiAutoCompactionGuard), not by this prune; keeping ~3 turns of images is a small delta vs. the previous 0-1
- [...] (state-migrations.ts) still exists, so this cleanup remains load-bearing for migrated sessions
Tests
6 tests pass including two new cases:
- keeps image blocks within the last 3 assistant turns to preserve prompt cache
- prunes only old images while preserving recent ones

🤖 Generated with Claude Code